AITopics | transposable mask

transposable mask

b0490b85e92b64dbb5db76bf8fca6a82-Supplemental.pdf

Neural Information Processing Systems, Feb-10-2026, 17:43:53 GMT

adaprune, algorithm, transposable mask, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

b0490b85e92b64dbb5db76bf8fca6a82-Paper.pdf

Neural Information Processing Systems, Feb-10-2026, 17:43:49 GMT

accuracy, neural network, sparsity, (15 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
North America > United States (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks

Neural Information Processing Systems, Dec-24-2025, 18:30:45 GMT

Unstructured pruning reduces the memory footprint of deep neural networks (DNNs). Recently, researchers proposed several types of structural pruning that also aim to reduce computational complexity. In this work, we first suggest a new measure, called mask-diversity, which correlates with the expected accuracy of the different types of structural pruning. We focus on the recently suggested N:M fine-grained block sparsity mask, in which each block of M weights contains at least N zeros. Although N:M fine-grained block sparsity allows acceleration on actual modern hardware, it can be used only to accelerate the inference phase.
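The N:M block constraint described above can be illustrated with a minimal sketch. It assumes magnitude-based pruning (zeroing the N smallest-magnitude weights in each block of M consecutive weights along a row), which the summary does not specify; the function name `nm_prune_mask` and the sample weights are hypothetical.

```python
import numpy as np


def nm_prune_mask(weights: np.ndarray, n_zeros: int, m: int) -> np.ndarray:
    """Return a 0/1 mask that zeroes the n_zeros smallest-magnitude
    weights in every block of m consecutive weights along each row.

    Assumption: magnitude is used as the pruning criterion.
    """
    rows, cols = weights.shape
    if cols % m != 0:
        raise ValueError("number of columns must be a multiple of m")
    # Group each row into blocks of m weights.
    blocks = np.abs(weights).reshape(rows, cols // m, m)
    # Ascending sort: the first n_zeros indices in each block are the smallest.
    order = np.argsort(blocks, axis=-1)
    mask = np.ones(blocks.shape, dtype=np.int8)
    np.put_along_axis(mask, order[..., :n_zeros], 0, axis=-1)
    return mask.reshape(rows, cols)


# One row with two blocks of four weights, pruned to 2:4 sparsity.
w = np.array([[0.9, -0.1, 0.3, -0.7, 0.2, 0.8, -0.5, 0.05]])
mask = nm_prune_mask(w, n_zeros=2, m=4)
print(mask)  # [[1 0 0 1 0 1 1 0]]
```

Each block of four keeps its two largest-magnitude weights, so every block has exactly two zeros and satisfies the "at least N zeros" condition for N = 2, M = 4.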

accelerated sparse neural training, name change, provable and efficient method, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.58)

b0490b85e92b64dbb5db76bf8fca6a82-Supplemental.pdf

Neural Information Processing Systems, Aug-16-2025, 21:18:59 GMT

algorithm, artificial intelligence, transposable mask, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.97)

b0490b85e92b64dbb5db76bf8fca6a82-Paper.pdf

Neural Information Processing Systems, Aug-16-2025, 21:18:56 GMT

machine learning, natural language, sparsity, (19 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks

Neural Information Processing Systems, Jan-18-2025, 17:39:20 GMT

accelerated sparse neural training, provable and efficient method, transposable mask, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

Accelerating Transformer Pre-training with 2:4 Sparsity

Hu, Yuezhou, Zhao, Kang, Huang, Weiyu, Chen, Jianfei, Zhu, Jun

arXiv.org Artificial Intelligence, May-27-2024

Training large transformers is slow, but recent innovations in GPU architecture give us an advantage. NVIDIA Ampere GPUs can execute a fine-grained 2:4 sparse matrix multiplication twice as fast as its dense equivalent. In light of this property, we comprehensively investigate the feasibility of accelerating the feed-forward networks (FFNs) of transformers in pre-training. First, we define a "flip rate" to monitor the stability of a 2:4 training process. Using this metric, we propose three techniques to preserve accuracy: modifying the sparse-refined straight-through estimator by applying the masked decay term to gradients, determining a feasible decay factor in the warm-up stage, and enhancing the model's quality with a dense fine-tuning procedure near the end of pre-training. In addition, we devise two techniques to practically accelerate training: calculating transposable 2:4 masks by convolution, and accelerating gated activation functions by reducing GPU L2 cache misses. Experiments show that our 2:4 sparse training algorithm achieves convergence similar to that of dense training algorithms on several transformer pre-training tasks, and that actual acceleration is observed across transformer blocks of different shapes. Our toolkit is available at https://github.com/huyz2023/2by4-pretrain.
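The abstract names a "flip rate" but does not give its formula. The sketch below assumes one plausible reading: the fraction of mask entries that change between two consecutive training steps. The function name `flip_rate` and the sample masks are hypothetical and are not taken from the authors' toolkit.

```python
import numpy as np


def flip_rate(prev_mask, new_mask) -> float:
    """Fraction of mask entries that differ between two steps.

    Assumption: a "flip" is any entry whose kept/pruned state changes;
    the paper's exact definition may differ.
    """
    prev_mask = np.asarray(prev_mask, dtype=bool)
    new_mask = np.asarray(new_mask, dtype=bool)
    if prev_mask.shape != new_mask.shape:
        raise ValueError("masks must have the same shape")
    return float(np.mean(prev_mask != new_mask))


# Two 2:4 masks from consecutive steps; two of eight entries change.
prev = [[1, 1, 0, 0], [0, 1, 0, 1]]
new = [[1, 0, 1, 0], [0, 1, 0, 1]]
rate = flip_rate(prev, new)
print(rate)  # 0.25
```

Under this reading, a rate near zero would indicate a mask that has settled, while a persistently high rate would indicate an unstable sparse training process.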

accelerating transformer pre-training, gradient, transformer, (14 more...)

arXiv.org Artificial Intelligence

arXiv:2404.01847

Country:

Europe > Austria > Vienna (0.14)
Asia > China (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks

Hubara, Itay, Chmiel, Brian, Island, Moshe, Banner, Ron, Naor, Seffi, Soudry, Daniel

arXiv.org Artificial Intelligence, Feb-16-2021

Recently, researchers proposed pruning deep neural network (DNN) weights using an $N:M$ fine-grained block sparsity mask. In this mask, each block of $M$ weights contains at least $N$ zeros. In contrast to unstructured sparsity, $N:M$ fine-grained block sparsity allows acceleration on actual modern hardware. So far, this was used for DNN acceleration at the inference phase. First, we suggest a method to convert a pretrained model with unstructured sparsity to an $N:M$ fine-grained block sparsity model, with little to no training. Then, to also allow such acceleration in the training phase, we suggest a novel transposable fine-grained sparsity mask, where the same mask can be used for both forward and backward passes. Our transposable mask ensures that both the weight matrix and its transpose follow the same sparsity pattern; thus the matrix multiplication required for passing the error backward can also be accelerated. We discuss the transposable constraint and devise a new measure for mask constraints, called mask-diversity (MD), which correlates with their expected accuracy. Then, we formulate the problem of finding the optimal transposable mask as a minimum-cost-flow problem and suggest a fast linear approximation that can be used when the masks dynamically change while training. Our experiments suggest a 2x speed-up with no accuracy degradation over vision and language models. A reference implementation can be found at https://github.com/papers-submission/structured_transposable_masks.
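The transposable constraint can be made concrete with a small sketch. It checks that a mask satisfies the $N:M$ condition along both rows and columns, and finds the best transposable 2:4 mask for a single 4x4 block by exhaustive search. This is not the minimum-cost-flow formulation or the fast approximation from the abstract; brute force is swapped in because it is easy to verify on a tiny block. The function names, the magnitude-sum objective, and the restriction to exactly two kept weights per row and column are assumptions made for illustration.

```python
from itertools import combinations, product

import numpy as np


def is_transposable_nm(mask, n_zeros: int, m: int) -> bool:
    """True if every length-m block along rows AND along columns
    contains at least n_zeros zeros."""
    mask = np.asarray(mask)

    def rows_ok(a: np.ndarray) -> bool:
        r, c = a.shape
        if c % m != 0:
            return False
        blocks = a.reshape(r, c // m, m)
        return bool(np.all((blocks == 0).sum(axis=-1) >= n_zeros))

    return rows_ok(mask) and rows_ok(mask.T)


def best_transposable_2_4(block):
    """Exhaustive search over 4x4 masks with exactly two ones per row
    and per column, maximizing the retained weight magnitude.

    Assumption: brute force stands in for the paper's min-cost-flow solver.
    """
    mag = np.abs(np.asarray(block, dtype=float))
    if mag.shape != (4, 4):
        raise ValueError("expected a 4x4 block")
    # The six ways to keep two of four entries in one row.
    row_patterns = [
        np.array([1 if i in keep else 0 for i in range(4)])
        for keep in combinations(range(4), 2)
    ]
    best_mask, best_score = None, -1.0
    for rows in product(row_patterns, repeat=4):
        cand = np.stack(rows)
        if not np.all(cand.sum(axis=0) == 2):
            continue  # a column would keep more or fewer than two weights
        score = float((cand * mag).sum())
        if score > best_score:
            best_mask, best_score = cand, score
    return best_mask, best_score


# Row-wise top-2 pruning keeps columns 0 and 1 in every row here,
# which violates the constraint on the transpose.
w = np.array([[9, 8, 1, 2], [9, 7, 2, 1], [8, 9, 1, 2], [7, 9, 2, 1]])
rowwise = np.array([[1, 1, 0, 0]] * 4)
mask, score = best_transposable_2_4(w)
print(is_transposable_nm(rowwise, 2, 4))  # False
print(is_transposable_nm(mask, 2, 4))     # True
```

The search visits 6^4 = 1296 row combinations per block, which is fine for a demonstration but motivates the more efficient formulations mentioned in the abstract when masks must be recomputed during training.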

accelerated sparse neural training, provable and efficient method, sparsity, (14 more...)

arXiv.org Artificial Intelligence

arXiv:2102.08124

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
